Trace-Driven Debugging of Message Passing Programs
Abstract
In this paper we report on features added to a parallel debugger to simplify the debugging of message passing programs. These features include replay, setting consistent breakpoints based on interprocess event causality, a parallel undo operation, and communication supervision. These features all use trace information collected during the execution of the program being debugged. We used a number of different instrumentation techniques to collect traces. We also implemented trace displays using two different trace visualization systems. The implementation was tested on an SGI Power Challenge cluster and a network of SGI workstations.
Similar resources
Efficient Tracing for On-the-Fly Space-Time Displays in a Debugger for Message Passing Programs
In this work we describe the implementation of a practical mechanism for collecting and displaying trace information in a debugger for message passing programs. We introduce a trace format that is highly compressible while still providing information adequate for debugging purposes. We make the mechanism convenient for users to access by incorporating the trace collection in a set of wrappers f...
The Performance of Two Tracing and Replay Algorithms for Message-Passing Parallel Programs
Debugging parallel message-passing programs is complicated by the non-determinism that is inherent in those programs. Cyclical debugging, which is a proven method for sequential programs, often fails when debugging parallel programs because different executions of the same program may exhibit different behaviors due to non-determinism. Some approaches have been studied to remedy this problem. W...
An Efficient Logical Clock for Replaying Message-Passing Programs
Cyclic debugging is one of the most important and most commonly used activities in program development. During cyclic debugging, the program is repeatedly re-executed to track down errors when a failure has been observed. The cyclic debugging approach often fails for parallel programs because parallel programs reveal nondeterministic characteristics due to message race conditions. Execution re...
Using XML to Specify a Trace Format for MPI Programs
Trace files have long been used to assist correctness debugging and performance debugging of parallel programs. With the advent of implementations of the Message Passing Interface (MPI) standard, parallel and distributed computing has become more common, and thus the need for quality debugging tools has increased. It is important that trace file formats be extensible, flexible, and architectura...
Optimal Run-Time Tracing of Message-Passing Programs
The widespread adoption of distributed computing has accentuated the need for an effective set of support tools to facilitate debugging and monitoring of distributed programs. Unfortunately for distributed programs, this is not a trivial task. Many distributed programs are inherently non-deterministic in nature. Two runs of the same program with the same input data may not result in the same ex...
Publication date: 1998